perm filename RPT.RLL[RDG,DBL] blob
sn#537643 filedate 1980-09-25 generic text, type T, neo UTF8
Report of the ESW Oil Spill Effort
RLL: A Representation Language Language
Doug Lenat and Russ Greiner
Unlike most groups, we (Lenat and Greiner) focused on the entire spill
crisis treatment scenario, and paid only slight extra attention to the
subproblems of Discovery (initial intake interview) and Source Location
(by backtracking or by indirect analysis). In fact, we considered OTHER
problems, such as locating an escaped convict (where the unwanted material
is spilled onto conduits (roads) and must be located, etc.). Whenever any
piece of knowledge was added to RLL, the question we invariably posed was:
can this be generalized or abstracted in some way, and still retain its
potency, its power for constraining search? Most of the knowledge we have
so far represented within RLL is common to both the convict and the oil
spill problems, and is represented in a manner usable by the system in
either context. Of course there are individual differences in technology,
such as road blocks instead of absorbent booms, but those differences are
at a much lower level (e.g., terminology) than most inference processes
deal with. This kind of generality is one of the major powers of RLL --
and, due to the effort required to exercise it, one of the greatest
liabilities when a constraint is to have a running system in two days. As
we hope RLL's mechanisms will eventually be widely used, we are attempting
to enter the information -- whether data or control structure -- in as
unbiased and extensible a manner as possible. We chose to sacrifice
"performing a flashy demo" for "representing things the right way"; toward
the end, we had to sacrifice both of them to get even a meager demo
running.
One of the early exercises we performed was to hand-simulate a dialogue
with the system. It became clear that we would have to choose a "role"
for the system to play. We noted that the greatest need was during the
night, when nightshift workers who were ill-equipped to deal with spills
nevertheless had to. Thus, our model is one where a spill is encountered,
called in to the program, and the latter then directs the activities of
the discoverer, sends out other teams, notifies various authorities, etc.
Thus the role is one of REPLACING the expert in this process. We believe
that almost all the information can, however, also be used for tutorial
purposes, for advising an expert, etc. This is one reason for
representing each piece of knowledge EXPLICITLY, rather than burying it
within a piece of code.
As our simulation continued, we observed that there was frequent need to
"suspend" one of the major tasks we had begun, to attend to some new
datum, some new conclusion with dramatic consequences ("Don't breathe that
stuff!"), or simply because the current task seemed to be bogging down.
The control structure which this type of interaction suggested (to us) is
an AGENDA of tasks, very much like the agenda of AM. Each task would have
some priority rating, and when selected would fire production-like rules,
until it was satisfied or until its quantum of cpu time expired (in which
case it would be suspended). During the firing of a rule, it could direct
the program to add new tasks to the agenda, modify the data base, ask the
user for some information, tell him some, etc.
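The agenda just described can be sketched as follows. This is a minimal
illustration in Python, not the RLL units we actually entered: the class
names, the priority numbers, the placeholder fact values, and the use of a
count of rule firings as a stand-in for the cpu-time quantum are all
assumptions made for the example.

```python
import heapq
import itertools


class Task:
    """A task on the agenda: a name, a priority, and the rules not yet fired."""

    def __init__(self, name, priority, rules):
        self.name = name
        self.priority = priority
        self.pending = list(rules)

    def satisfied(self):
        return not self.pending


class Agenda:
    def __init__(self, quantum):
        self.quantum = quantum      # rule firings per turn (stand-in for cpu time)
        self.facts = {}             # the "data base" that rules may modify
        self.log = []               # (outcome, task name), in the order they occur
        self._heap = []
        self._ticket = itertools.count()

    def add(self, task):
        # heapq pops the smallest entry, so store the negated priority;
        # the ticket number breaks ties in first-come order.
        heapq.heappush(self._heap, (-task.priority, next(self._ticket), task))

    def run(self):
        while self._heap:
            _, _, task = heapq.heappop(self._heap)
            fired = 0
            while not task.satisfied() and fired < self.quantum:
                rule = task.pending.pop(0)
                rule(self)          # a rule may add tasks or change facts
                fired += 1
            if task.satisfied():
                self.log.append(("done", task.name))
            else:
                self.log.append(("suspended", task.name))
                self.add(task)      # back on the agenda, to be resumed later


# Hypothetical rules; the values they record are placeholders.
def ask_spill_type(agenda):
    agenda.facts["spill_type"] = "unknown liquid"
    agenda.add(Task("characterize material", 8, [preliminary_id]))


def ask_location(agenda):
    agenda.facts["location"] = "as reported by discoverer"


def ask_department(agenda):
    agenda.facts["department"] = "as reported by discoverer"


def preliminary_id(agenda):
    agenda.facts["material"] = "oil"
    agenda.add(Task("evaluate hazards", 9, [rate_hazard]))


def rate_hazard(agenda):
    agenda.facts["hazard"] = "tentative rating"


def demo():
    agenda = Agenda(quantum=2)
    agenda.add(Task("interview discoverer", 5,
                    [ask_spill_type, ask_location, ask_department]))
    agenda.run()
    return agenda


for entry in demo().log:
    print(entry)
```

In this sketch the interview task exhausts its quantum after two rules and
is suspended; the two higher-priority tasks it gave rise to then run to
completion before the interview is resumed.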
While we have worked on RLL for some time, we had not (until this problem)
implemented this type of control structure; hence our first major task was
to describe it to RLL. (This meant encoding it as a collection of units
including rules, tasks, priorities, special values returnable by rules and
by tasks, etc.) These new control-related units were entered into one of
our permanent system knowledge bases (EURISKO), rather than into the new one
we had created for this task (SPILL), because of the future utility of the
agenda mechanism.
The second "lack" we felt in the then-extant RLL system was the notion of
gradual restriction (corresponding to the SPEC relation, defined in the
MOLGEN UNITS package [Stefik]). In particular, we needed to deal with
generic events, whose descendants could become gradually more specialized,
instantiated, particularized. We added the units for events in general
and pipe breaks, flows, etc. in particular. We also added units
describing the type of gradual restriction we wanted to have connecting
events. We represented several kinds of connections between events,
several kinds of slots that were new to RLL: MoreGeneralEvents,
CausesOtherEvents, CausedBy, PriorEvents, MoreSpecializedEvents,
LaterEvents, SimultEvents, etc.
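One way to picture these event slots is sketched below in Python. The slot
names are the ones listed above; everything else is an assumption made for
illustration, in particular the pairing of each slot with an inverse, the
treatment of SimultEvents as its own inverse, and the unit name
BreakInPipe90. It is not a description of how RLL stores these units.

```python
# Assumed pairing of each event slot with its inverse.
INVERSE = {
    "MoreGeneralEvents": "MoreSpecializedEvents",
    "MoreSpecializedEvents": "MoreGeneralEvents",
    "CausesOtherEvents": "CausedBy",
    "CausedBy": "CausesOtherEvents",
    "PriorEvents": "LaterEvents",
    "LaterEvents": "PriorEvents",
    "SimultEvents": "SimultEvents",
}


class EventUnit:
    def __init__(self, name):
        self.name = name
        self.slots = {slot: set() for slot in INVERSE}

    def link(self, slot, other):
        """Fill a slot on this unit and the inverse slot on the other unit."""
        self.slots[slot].add(other.name)
        other.slots[INVERSE[slot]].add(self.name)


def generalizations(units, name):
    """Follow MoreGeneralEvents upward, nearest generalization first."""
    found = []
    frontier = sorted(units[name].slots["MoreGeneralEvents"])
    while frontier:
        current = frontier.pop(0)
        if current not in found:
            found.append(current)
            frontier.extend(sorted(units[current].slots["MoreGeneralEvents"]))
    return found


units = {n: EventUnit(n) for n in ("Event", "PipeBreak", "Flow", "BreakInPipe90")}
units["PipeBreak"].link("MoreGeneralEvents", units["Event"])
units["Flow"].link("MoreGeneralEvents", units["Event"])
units["BreakInPipe90"].link("MoreGeneralEvents", units["PipeBreak"])
units["PipeBreak"].link("CausesOtherEvents", units["Flow"])

print(generalizations(units, "BreakInPipe90"))
print(units["Flow"].slots["CausedBy"])
```

Gradual restriction then amounts to adding a new unit below an existing
one: the hypothetical BreakInPipe90 is reachable from Event through
PipeBreak without either of the more general units having to be changed.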
The third thing we noticed was that RLL had no notion of a Problem. It
had previously been used only on open-ended types of tasks, never those
admitting a precise answer or solution. Units for these concepts had to
be added.
Finally, we began to enter units for concepts which had at least SOMETHING
to do with the target task: liquids, chemicals (and oils and acids in
particular), pH, flows, containers, mixings, etc. At this level of
abstraction, none of this was specific to the particular problem given.
The incorporation of the above units took two days of part-time work;
probably 25 man-hours in all. (Much additional time was spent fixing up
RLL: In addition to fleshing out many skeletons, like the agenda mechanism
mentioned above, there were a host of low level bugs which had to be
fixed.) Before this task was completed, we had sketched out how we would
represent such task-specific details as the White Oak Creek drainage
system, the four major pieces of legislation which define the possible
violations, the particular counter-measures which can be taken to halt the
flow of oil or acid, etc. Units for some of these have been entered. The
final type of problem-specific knowledge which we had to enter to get RLL
"running" was the set of rules which manage the various phases of the
spill crisis management problem. These ranged from trivial
information-requesting rules (If the discoverer's name isn't known, ask
it) to judgmental rules for counter-measures (If the flow is to be stopped
at a Weir, then use a skimmer). Not all of these have been added, and as a
result the "demos" produced by the system are incomplete. Essentially, we
began entering task rules on Tuesday night -- into a system which was only
then at the stage most other groups had on Saturday. Because most of the
preliminary knowledge was represented in a reasonable way, it will be
usable in the future. It is important to realize that RLL itself was
altered. (We are NOT including the removal of various bugs in this
category.) In addition to the SPILL-related specific facts just entered,
RLL now better understands agendae, generic objects, and control
mechanisms. As these will remain in RLL, it will be considerably easier
to implement subsequent applications which are "close" to this one.
The details of our small implementation can best be apprehended from the
figures, traces, knowledge bases, etc. which accompany this document.
Some simple consultations (dialogues) have been run through, including
directing the user in a backtrack search to locate the source of the
spill.
Note in particular the manner in which one task starts (interview
discoverer) but spends only a few seconds on it. Some of the rules
associated with achieving that task are fired (getting the spill type and
location), but many are not (getting the discoverer's department address).
Of higher priority is a preliminary identification of the material which
has spilled, and so the Discovery task is suspended and the
material-characterization task is chosen to run. After a preliminary ID
is made (oil, acid, perhaps one level more detail, but NOT the precise
chemical composition or trade name of it), that task too is suspended.
The highest priority then is Evaluating potential hazards. Thus, within
about 10 cpu seconds, RLL has formed a tentative picture of what spilled,
where, and how dangerous it is. Gradually, that picture is fleshed out,
as more tasks are executed, and as suspended tasks are resumed and worked
on some more. The power of the agenda is in allowing any "high priority"
rule to trigger at almost any time.
The versatility and adaptability of this agenda mechanism, together with
the later general utility of the knowledge, are the major strengths of this
implementation. Similar flexibility can be found in the RLL language
itself. To understand its malleability, one has to consider the range of
things which the RLL user may regard as "parameters" -- i.e. what he is
allowed to specify, as opposed to finding hardwired in.
Each of the expert system building systems has a different idea of what qualifies
as domain specific information (that is, what the user should be expected
to enter). For example, none of these ESBSs (expert system building
systems) would be expected to know, a priori, the specifics of this
particular plant, such as "Pipe90 connects to Pipe82" or "All permanent
storage tanks are diked". Similarly, none of these systems would have
facts at one higher level -- for example, information about chemistry
(e.g., Oil#33 is corrosive) or connectivity (e.g., that each pipe will flow
into some other pipe, unless it leaks) -- built in. As such, information
in both categories would have to be entered.
RLL goes one step further, by allowing the user to specify what control
regime to use as well. This does NOT imply this information must be
entered in LISP code, any more than the other facts (pertaining, for
example, to acids or dikes) had to be given in so low level a manner.
RLL first includes a set of known mechanisms (e.g., Backward Chaining Rules
or Agenda), from which the user may conveniently select the one he wishes.
In addition, RLL provides a collection of tools, which the user can use to
construct his own new control regimes, if necessary. These tools describe
the control information in high level, natural terms.
As for the weaknesses, one of the most obvious ones is the extra cost of
getting this system running: we can't assert that Pipe3 flows-into Pipe4
without first creating a unit for the relation flows-into, explaining that
that isa Slot, that it is meaningful for any two conduits, etc., etc.,
etc. One thing that might be expected to be a weakness is the apparent
inefficiency this high degree of "interpretiveness" implies. To the
contrary, this is one of RLL's big strengths: see [Lenat, Hayes-Roth, and
Klahr] for details of how caching and other techniques recapture the
efficiency that would otherwise be lost. Admittedly, the FIRST time you
ask RLL to do something, it takes a LONG time, but from then on a similar
type of request will return fairly quickly. One severe weakness is the
absence of a front-end; the user must build his system by editing units,
rather than through the nice human-engineered dialogue he can have with,
e.g., EMYCIN. The final SYSTEM produced, however, can have a simple user
interface
(and in fact this is one reason we had the ROLE of our system be that of
the expert -- it could have the initiative almost exclusively, and simply
ask questions of the user).
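The extra cost mentioned at the start of this discussion, that flows-into
must itself be described before it can be used, can be sketched as
follows. This is a hedged Python illustration only: the KnowledgeBase
class, the makes_sense_for field, the Pipe and Chemical units, and the
RejectedAssertion error are hypothetical names invented for the example,
not part of RLL.

```python
class RejectedAssertion(Exception):
    """Raised when a fact uses a relation the knowledge base cannot interpret."""


class KnowledgeBase:
    def __init__(self):
        self.units = {}     # unit name -> dictionary of slot values

    def add_unit(self, name, **slots):
        self.units[name] = dict(slots)

    def isa(self, name, kind):
        """True if following the isa slot upward from name reaches kind."""
        current = self.units.get(name, {}).get("isa")
        while current is not None:
            if current == kind:
                return True
            current = self.units.get(current, {}).get("isa")
        return False

    def assert_fact(self, subject, slot, value):
        # The relation must already exist as a unit that isa Slot.
        if not self.isa(slot, "Slot"):
            raise RejectedAssertion(f"{slot} has not been described as a Slot")
        # Both arguments must fall in the domain the slot is meaningful for.
        domain = self.units[slot]["makes_sense_for"]
        for argument in (subject, value):
            if not self.isa(argument, domain):
                raise RejectedAssertion(f"{slot} is not meaningful for {argument}")
        self.units[subject].setdefault(slot, []).append(value)


kb = KnowledgeBase()
kb.add_unit("Pipe", isa="Conduit")
kb.add_unit("Pipe3", isa="Pipe")
kb.add_unit("Pipe4", isa="Pipe")
kb.add_unit("Oil#33", isa="Chemical")

# Before flows-into has been described, the assertion is refused.
try:
    kb.assert_fact("Pipe3", "flows-into", "Pipe4")
    rejected_before = False
except RejectedAssertion:
    rejected_before = True

# Describe the relation: it isa Slot, meaningful for any two conduits.
kb.add_unit("flows-into", isa="Slot", makes_sense_for="Conduit")
kb.assert_fact("Pipe3", "flows-into", "Pipe4")

print(rejected_before, kb.units["Pipe3"]["flows-into"])
```

The same description that makes the first assertion possible also lets the
sketch refuse a meaningless one, such as a pipe flowing into Oil#33.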
In this experiment, we have been forced to the realization that, when
only a small amount of time is available, a simpler language (such as
EMYCIN or LISP) is able to achieve SOME results more quickly. Some of the
goals of RLL, including its aiding of the user in producing an expert
system, have simply not been realized yet. The experience has also
reinforced our view of the
process of building an expert system as an incremental approach to
competence. Innumerable times, compromises have to be made, sacrifices of
"the right way" to the altar of "getting started". We have honed our
abilities to make such sacrifices (one of the requisites of a CKE), and
have honed our facilities to make them in a way that does not preclude
redoing things in a better way later (another CKE requisite). To close
with one example of this process, our original design had four Violation
rules, one for each piece of legislation; later, as we learned more about
the complexities of those regulations, we realized the necessity of
replacing those four rules with four separate tasks, each of which had
several rules attached. This kind of flexibility, which is admittedly
just beginning in RLL, is the cornerstone of successful KEing.